Dependency Analysis of Japanese Spoken Language via SVM
Abstract
This paper discusses a dependency analyzer for Japanese spoken language that employs Support Vector Machines (SVMs). Most conventional dependency analyzers target written text. We therefore train SVMs on a currently available spoken language corpus to build a dependency analyzer that targets spoken language. We used two types of corpora: one of written language and one of spoken language. We cleaned the corpora by repeatedly running closed tests with the SVMs. The cleaning raised the accuracy of dependency structure analysis, measured by two-fold cross-validation, by more than 1% for written language and by 0.25% for spoken language.
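The abstract does not spell out the cleaning procedure. One plausible reading, sketched below in Python, is that a closed test (training and testing on the same data) exposes examples whose annotated label the model refuses to reproduce, and that these examples are then reviewed before the next round. Everything in the sketch is an assumption made for illustration: the function names (`train_linear_svm`, `closed_test`, `clean_corpus`), the `review` callback standing in for a human annotator, the toy two-dimensional features, and the use of a plain linear SVM trained by subgradient descent in place of whatever kernel and feature set the paper actually used.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Primal linear SVM: batch subgradient descent on the
    L2-regularized hinge loss. Labels must be +1 or -1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw = [lam * wj for wj in w]   # gradient of the regularizer
        gb = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:   # example violates the margin
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b


def predict(model, x):
    w, b = model
    return 1 if dot(w, x) + b >= 0 else -1


def closed_test(X, y):
    """Train on (X, y), test on the same data, and return the indices
    where the model's prediction disagrees with the annotated label."""
    model = train_linear_svm(X, y)
    return [i for i, (xi, yi) in enumerate(zip(X, y))
            if predict(model, xi) != yi]


def clean_corpus(X, y, review, max_rounds=5):
    """Repeat closed tests; pass each flagged example to `review`
    (a hypothetical annotator callback) until nothing changes."""
    y = list(y)
    for _ in range(max_rounds):
        changed = False
        for i in closed_test(X, y):
            corrected = review(i, y[i])
            if corrected != y[i]:
                y[i] = corrected
                changed = True
        if not changed:
            break
    return y


# Toy data (hypothetical): two well-separated clusters.
pos = [(2, 2), (3, 2), (2, 3), (3, 3), (2.5, 2.5)]
X = pos + [(-a, -b) for a, b in pos]
gold = [1] * 5 + [-1] * 5

noisy = list(gold)
noisy[3] = -1   # simulated annotation error on the point (3, 3)

flagged = closed_test(X, noisy)
cleaned = clean_corpus(X, noisy, review=lambda i, label: gold[i])
```

In this toy setting the closed test flags only the mislabeled example, and one round of review restores the gold labels. On a real corpus a flagged example is only a candidate: the model may be wrong rather than the annotation, which is why the sketch routes every flag through `review` instead of overwriting labels automatically.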
Similar resources
Incremental dependency parsing of Japanese spoken monologue based on clause boundaries
In applications of spoken monologue processing such as simultaneous machine interpretation and real-time captions generation, incremental language parsing is strongly required. This paper proposes a technique for incremental dependency parsing of Japanese spoken monologue on a clause-by-clause basis. The technique identifies the clauses based on clause boundaries analysis, analyzes the dependen...
Stochastic Dependency Parsing of Spontaneous Japanese Spoken Language
This paper describes the characteristic features of dependency structures of Japanese spoken language by investigating a spoken dialogue corpus, and proposes a stochastic approach to dependency parsing. The method can robustly cope with inversion phenomena and bunsetsus which don’t have the head bunsetsu by relaxing the syntactic dependency constraints. The method acquires in advance the probab...
Examining the difficulty pathways of can-do statements from a localized version of the CEFR
The Japanese adaptation of the Common European Framework of Reference (CEFR-J) is a tailored version of the Common European Framework of Reference (CEFR), designed to better meet the needs of Japanese learners of English. The CEFR-J, like the CEFR, uses illustrative descriptors known as can-do statements, that describe achievement goals for five skills (listening, reading, spoken ...
Simultaneous English-Japanese Spoken Language Translation Based on Incremental Dependency Parsing and Transfer
This paper proposes a method for incrementally translating English spoken language into Japanese. To realize simultaneous translation between languages with different word order, such as English and Japanese, our method utilizes the feature that the word order of a target language is flexible. To resolve the problem of generating a grammatically incorrect sentence, our method uses dependency st...
The Use of Prosody in Japanese Dependency Structure Analysis
Natural language processing has traditionally relied solely on information extracted from written materials. Recently, as part of natural language processing and speech processing merge into spoken language processing, a new possibility is opening up to exploit prosodic information, which is lost when utterances are transcribed into letters or characters, for language processing. Such a possibi...